| | system_calendar_key_N | product_id | sales_dollars_value | sales_units_value | sales_lbs_value | Vendor | Claim_id | Claim_name | date | platform | searchVolume | week_number | year_new |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 20160109 | 1 | 13927.000000 | 934 | 18680 | Others | 0 | No Claim | NaT | nan | nan | nan | nan |
| 1 | 20160109 | 3 | 10289.000000 | 1592 | 28646 | Others | 0 | No Claim | NaT | nan | nan | nan | nan |
| 2 | 20160109 | 4 | 357.000000 | 22 | 440 | Others | 0 | No Claim | NaT | nan | nan | nan | nan |
| 3 | 20160109 | 6 | 23113.000000 | 2027 | 81088 | Others | 0 | No Claim | NaT | nan | nan | nan | nan |
| 4 | 20160109 | 7 | 23177.000000 | 3231 | 58164 | Others | 0 | No Claim | NaT | nan | nan | nan | nan |

| | system_calendar_key_N | product_id | sales_dollars_value | sales_units_value | sales_lbs_value | Vendor | Claim_id | Claim_name | date | platform | searchVolume | week_number | year_new |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4526177 | 20181027 | 47536 | 8.000000 | 2 | 3 | Private Label | 0 | No Claim | NaT | nan | nan | nan | nan |
| 4526178 | 20181027 | 47539 | 391.000000 | 39 | 68 | Private Label | 0 | No Claim | NaT | nan | nan | nan | nan |
| 4526179 | 20181027 | 47543 | 105.000000 | 59 | 48 | Private Label | 8 | low carb | 2019-09-30 00:00:00 | walmart | 42.000000 | 40.000000 | 2019.000000 |
| 4526180 | 20181027 | 47544 | 3720.000000 | 1246 | 4361 | Private Label | 0 | No Claim | NaT | nan | nan | nan | nan |
| 4526181 | 20181027 | 47545 | 1729.000000 | 2016 | 378 | Private Label | 227 | salmon | 2019-07-11 00:00:00 | walmart | 42.000000 | 28.000000 | 2019.000000 |

| | no_of_missing | imputed with |
|---|---|---|
| date | 2796557 | Not Imputed |
| searchVolume | 2796557 | median |
| week_number | 2796557 | median |
| platform | 2796557 | mode |
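
The imputation rules in the table above (median for numeric columns, mode for the categorical column) can be sketched in plain Python. This is a minimal illustration on made-up values, not the code that produced the report; the `impute` helper and the use of `None` to mark a missing entry are assumptions made for the example.

```python
from statistics import median, mode

def impute(values, strategy):
    """Return a copy of values with None replaced by the median or mode
    of the observed (non-missing) entries."""
    observed = [v for v in values if v is not None]
    if strategy == "median":
        fill = median(observed)
    elif strategy == "mode":
        fill = mode(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]

# Toy data (hypothetical, not taken from the report).
search_volume = [42.0, None, 41.0, 416.0, None]
platform = ["walmart", None, "amazon", "walmart", None]

search_volume_filled = impute(search_volume, "median")  # numeric column
platform_filled = impute(platform, "mode")              # categorical column
```

With more than 60% of the rows missing in these columns, the fill value ends up dominating the column, which is worth keeping in mind when reading the summary statistics further down.
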
No Infinity values

| | original_type | new_columns | method |
|---|---|---|---|
| Vendor | category | target_encoded_Vendor | target encoded |
| Claim_name | category | target_encoded_Claim_name | target encoded |
| platform | category | target_encoded_platform | target encoded |

| Vendor | encoded values |
|---|---|
| A | 61750.959917 |
| B | 39558.815754 |
| D | 48861.144854 |
| E | 86181.821483 |
| F | 30266.303846 |
| G | 86227.455719 |
| H | 22957.239303 |
| Others | 8444.867410 |
| Private Label | 11005.184886 |

| Claim_name | encoded values |
|---|---|
| No Claim | 28727.111414 |
| american gumbo | 5818.260216 |
| american southwest style | 15267.730213 |
| apple cinnamon | 15477.187623 |
| beans | 5928.162127 |
| beef hamburger | 812.664179 |
| blueberry | 14637.362056 |
| bone health | 3436.753788 |
| brown ale | 5033.345939 |
| buckwheat | 8388.825453 |
| cherry | 923.080275 |
| chicken | 5459.513713 |
| cocoa | 1071.791209 |
| convenience - easy-to-prepare | 2457.032690 |
| cookie | 23814.158296 |
| crab | 4815.734170 |
| energy/alertness | 1463.182359 |
| ethical - packaging | 16630.730241 |
| ethnic & exotic | 23146.678131 |
| french bisque | 10226.889405 |
| gingerbread | 6364.437770 |
| gmo free | 14665.686773 |
| halal | 4337.050725 |
| herbs | 15143.724621 |
| high/source of protein | 4259.732247 |
| low calorie | 44263.891973 |
| low carb | 9757.032854 |
| low sodium | 8064.472025 |
| low sugar | 5069.789390 |
| mackerel | 517.027397 |
| no additives/preservatives | 16500.746148 |
| nuts | 3056.388896 |
| peanut | 8599.744731 |
| pizza | 34696.640181 |
| pollock | 25076.922129 |
| poultry | 6058.704206 |
| prebiotic | 750.491803 |
| red raspberry | 764.764706 |
| salmon | 48528.062483 |
| scallop | 7099.241866 |
| soy foods | 57203.664029 |
| stroganoff | 37362.982937 |
| tilapia | 765.604413 |
| tuna | 2635.494596 |
| vegetarian | 3938.537104 |

| platform | encoded values |
|---|---|
| amazon | 16500.746148 |
| chewy | 4259.732247 |
| google | 10746.825390 |
| walmart | 22655.396990 |
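
The encoded values above are consistent with plain mean target encoding against `sales_dollars_value`: the mean of each `target_encoded_*` column in the summary table matches the mean of `sales_dollars_value`. The report does not state the target column or whether any smoothing was applied, so the following is only a sketch under that assumption, with a hypothetical `target_encode` helper and made-up data.

```python
from collections import defaultdict

def target_encode(categories, target):
    """Map each category to the mean of the target over its rows.

    Returns the category -> mean mapping and the encoded column.
    No smoothing or cross-fitting is applied (assumption for this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, target):
        sums[cat] += y
        counts[cat] += 1
    mapping = {cat: sums[cat] / counts[cat] for cat in sums}
    encoded = [mapping[cat] for cat in categories]
    return mapping, encoded

# Toy data (hypothetical, not taken from the report).
vendor = ["A", "B", "A", "Others"]
sales = [100.0, 50.0, 300.0, 10.0]

mapping, encoded = target_encode(vendor, sales)
```

Encoding on the full dataset, as in this sketch, leaks the target into the feature; if the encoded columns feed a model, fitting the mapping on training folds only is the usual safeguard.
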

| | Variable Name | No of Missing (out of 4526182) | Percent Missing |
|---|---|---|---|
| 0 | date | 2796557 | 61.786225 |
| 1 | platform | 2796557 | 61.786225 |
| 2 | searchVolume | 2796557 | 61.786225 |
| 3 | week_number | 2796557 | 61.786225 |
| 4 | year_new | 2796557 | 61.786225 |
No duplicate variables
Data Shape: (4526182, 13)

| feature | < (mean-3*std) | > (mean+3*std) | < (1stQ - 1.5 * IQR) | > (3rdQ + 1.5 * IQR) | -inf | +inf |
|---|---|---|---|---|---|---|
| sales_dollars_value | 0 | 70655 | 0 | 631250 | 0 | 0 |
| sales_units_value | 0 | 47393 | 0 | 668926 | 0 | 0 |
| sales_lbs_value | 0 | 35562 | 0 | 715431 | 0 | 0 |
| Claim_id | 0 | 66482 | 0 | 810875 | 0 | 0 |
| searchVolume | 0 | 32416 | 0 | 204899 | 0 | 0 |
| week_number | 12896 | 0 | 0 | 0 | 0 | 0 |
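
The two outlier rules named in the column headers above (mean ± 3 standard deviations, and the 1.5 × IQR fences around the quartiles) can be sketched as follows. This is an illustration on made-up values; the `outlier_counts` helper, the use of the sample standard deviation, and the inclusive quartile method are assumptions, since the report does not say how these were computed.

```python
from statistics import mean, stdev, quantiles

def outlier_counts(values):
    """Count values outside mean +/- 3*std and outside the 1.5*IQR fences."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "below_3std": sum(v < mu - 3 * sd for v in values),
        "above_3std": sum(v > mu + 3 * sd for v in values),
        "below_iqr": sum(v < q1 - 1.5 * iqr for v in values),
        "above_iqr": sum(v > q3 + 1.5 * iqr for v in values),
    }

# Toy data (hypothetical): eleven typical values and one extreme value.
values = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 12.0, 13.0, 11.0, 12.0, 100.0]
counts = outlier_counts(values)
```

The IQR rule generally flags many more rows than the 3-sigma rule on right-skewed data, which matches the pattern in the sales columns above.
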

| | Variable Name | Datatype | No of Unique | Samples | Mean | Standard Deviation | Min | 25th percentile | Median | 75th percentile | Max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Claim_id | float64 | 44 | [0.0, 158.0, 227.0, 432.0, 185.0] | 63.160495 | 124.075774 | 0.000000 | 0.000000 | 8.000000 | 40.000000 | 435.715826 |
| 1 | product_id | float64 | 42616 | [1.0, 3.0, 4.0, 6.0, 7.0] | 28858.568988 | 15312.536560 | 1.000000 | 15069.000000 | 29981.000000 | 41513.000000 | 57317.000000 |
| 2 | sales_dollars_value | float64 | 254341 | [13927.0, 10289.0, 357.0, 23113.0, 23177.0] | 21594.541104 | 78180.565626 | 0.000000 | 523.000000 | 2655.000000 | 11765.000000 | 4395964.000000 |
| 3 | sales_lbs_value | float64 | 171749 | [18680.0, 28646.0, 440.0, 81088.0, 58164.0] | 12514.292899 | 47823.053345 | 0.000000 | 86.000000 | 611.000000 | 3770.000000 | 399173.765337 |
| 4 | sales_units_value | float64 | 71153 | [934.0, 1592.0, 22.0, 2027.0, 3231.0] | 3815.065102 | 11761.232981 | 1.000000 | 80.000000 | 403.000000 | 1807.000000 | 85720.276857 |
| 5 | searchVolume | float64 | 19 | [42.0, 41.0, 416.0, 82.0, 2737.6345175977913] | 75.472578 | 234.213209 | 2.000000 | 42.000000 | 42.000000 | 42.000000 | 2737.634518 |
| 6 | system_calendar_key_N | float64 | 196 | [20160109.0, 20160116.0, 20160123.0, 20160130.0, 20160206.0] | 20175054.752485 | 10735.371398 | 20160109.000000 | 20161231.000000 | 20171209.000000 | 20181103.000000 | 20191005.000000 |
| 7 | target_encoded_Claim_name | float64 | 45 | [28727.111413533636, 5459.513712624824, 48528.06248297776, 15477.187623235048, 23814.15829608204] | 21594.542805 | 10039.509580 | 517.027397 | 9757.032854 | 28727.111414 | 28727.111414 | 57203.664029 |
| 8 | target_encoded_Vendor | float64 | 9 | [8444.867410442677, 61750.95991743127, 11005.1848864566, 39558.81575434599, 86181.82148253488] | 21594.541105 | 19916.449630 | 8444.867410 | 8444.867410 | 11005.184886 | 39558.815754 | 86227.455719 |
| 9 | target_encoded_platform | float64 | 4 | [22655.396989619167, 16500.746148209884, 10746.82539023493, 4259.73224703279] | 21594.541103 | 2920.882736 | 4259.732247 | 22655.396990 | 22655.396990 | 22655.396990 | 22655.396990 |
| 10 | week_number | float64 | 14 | [40.0, 22.0, 28.0, 31.0, 36.0] | 37.755768 | 5.689954 | 9.804649 | 40.000000 | 40.000000 | 40.000000 | 40.000000 |

| | Variable Name | Datatype | No of Unique | Samples | Mode | Mode Freq | first | last | Mode Freq % |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Claim_name | category | 45 | ['No Claim', 'chicken', 'salmon', 'apple cinnamon', 'cookie'] | No Claim | 2045703 | NaT | NaT | 45.197100 |
| 1 | Vendor | category | 9 | ['Others', 'A', 'Private Label', 'B', 'E'] | Others | 2195912 | NaT | NaT | 48.515769 |
| 2 | platform | category | 4 | ['walmart', 'amazon', 'google', 'chewy'] | walmart | 3933788 | NaT | NaT | 86.911839 |
| 3 | date | datetime64[ns] | 22 | [numpy.datetime64('NaT'), numpy.datetime64('2019-06-01T00:00:00.000000000'), numpy.datetime64('2019-07-11T00:00:00.000000000'), numpy.datetime64('2019-08-04T00:00:00.000000000'), numpy.datetime64('2019-09-02T00:00:00.000000000')] | 2019-09-30 00:00:00 | 907287 | 2019-01-04 00:00:00 | 2019-09-30 00:00:00 | 52.455706 |

| | Variable 1 | Variable 2 | Corr Coef | Abs Corr Coef |
|---|---|---|---|---|
| 0 | sales_dollars_value | sales_lbs_value | 0.778679 | 0.778679 |
| 1 | sales_lbs_value | sales_dollars_value | 0.778679 | 0.778679 |
| 2 | sales_dollars_value | sales_units_value | 0.554073 | 0.554073 |
| 3 | sales_units_value | sales_dollars_value | 0.554073 | 0.554073 |
| 4 | sales_dollars_value | target_encoded_Vendor | 0.254749 | 0.254749 |
| 5 | target_encoded_Vendor | sales_dollars_value | 0.254749 | 0.254749 |
| 6 | product_id | sales_dollars_value | -0.147133 | 0.147133 |
| 7 | sales_dollars_value | product_id | -0.147133 | 0.147133 |
| 8 | target_encoded_Claim_name | sales_dollars_value | 0.128414 | 0.128414 |
| 9 | sales_dollars_value | target_encoded_Claim_name | 0.128414 | 0.128414 |
| 10 | target_encoded_platform | sales_dollars_value | 0.037361 | 0.037361 |
| 11 | sales_dollars_value | target_encoded_platform | 0.037361 | 0.037361 |
| 12 | week_number | sales_dollars_value | 0.021904 | 0.021904 |
| 13 | sales_dollars_value | week_number | 0.021904 | 0.021904 |
| 14 | sales_dollars_value | searchVolume | -0.019538 | 0.019538 |
| 15 | searchVolume | sales_dollars_value | -0.019538 | 0.019538 |
| 16 | Claim_id | sales_dollars_value | -0.010143 | 0.010143 |
| 17 | sales_dollars_value | Claim_id | -0.010143 | 0.010143 |
| 18 | system_calendar_key_N | sales_dollars_value | -0.003643 | 0.003643 |
| 19 | sales_dollars_value | system_calendar_key_N | -0.003643 | 0.003643 |